Add audio text to text #1691

Dhiraj309 · 2025-08-17T14:02:30Z

This PR introduces a new audio-text-to-text task to the huggingface.js library. It enables converting audio input into text using automatic speech recognition (ASR) models.

Key updates:

Added packages/tasks/src/tasks/audio-text-to-text/ with:

data.ts – metadata for datasets, models, metrics, and demo.

inference.ts – logic for converting audio to text.

about.md – task description.

spec/input.json & spec/output.json – example inputs and expected outputs.

Task summary: "Convert audio input into text using speech-to-text (ASR) models."

Demonstration includes a sample .wav file and expected transcription output.
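
The contents of spec/input.json and spec/output.json are not shown in this PR, so the following is only a minimal sketch of how such a spec pair might be mirrored in TypeScript. The field names (`inputs`, `text`) and the helpers `buildInput`, `isAudioTextToTextOutput`, and `parseTranscription` are assumptions made for illustration. They are not taken from the PR or from the library.

```typescript
// Hypothetical shapes: the field names below are assumptions, not taken from the PR.
interface AudioTextToTextInput {
  /** Audio payload, assumed here to be base64-encoded .wav bytes. */
  inputs: string;
}

interface AudioTextToTextOutput {
  /** Transcribed text. */
  text: string;
}

// Wrap an already-encoded audio string in the assumed input shape.
function buildInput(base64Audio: string): AudioTextToTextInput {
  return { inputs: base64Audio };
}

// Type guard: checks that an unknown value looks like the assumed output shape.
function isAudioTextToTextOutput(value: unknown): value is AudioTextToTextOutput {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { text?: unknown }).text === "string"
  );
}

// Parse a raw JSON response body and return the transcription,
// throwing if the body does not match the assumed output shape.
function parseTranscription(rawJson: string): string {
  const parsed: unknown = JSON.parse(rawJson);
  if (!isAudioTextToTextOutput(parsed)) {
    throw new Error("Response does not match the assumed output shape");
  }
  return parsed.text;
}
```

A guard like this could support the manual testing recommended below, since it rejects malformed responses before the transcription is used.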

⚠️ Note: This task is currently not integrated into the main pipeline or automated tests. Manual testing is recommended before pipeline inclusion.

Vaibhavs10

Apologies for the delay on this PR @Dhiraj309 - do note that there was an existing PR for this task already: #1212

We have since decided to converge on #1692 instead.

Thank you again for your contribution 🙏 (we'll proceed to close this)

cc: @Deep-unlearning @merveenoyan

Dhiraj309 added 2 commits August 17, 2025 15:35

Add audio-text-to-text task (speech-to-text)

a9d8076

Add audio-text-to-text task (speech-to-text)

c34b22e

Dhiraj309 requested review from SBrandeis, gary149, Wauplin, julien-c, pcuenca and ngxson as code owners August 17, 2025 14:02

Dhiraj309 added 5 commits August 18, 2025 12:16

feat(tasks): add audio-text-to-text task definition

e613750

Merge branch 'main' into add-audio-text-to-text

c45c8c8

Merge branch 'main' into add-audio-text-to-text

c85f637

Merge branch 'main' into add-audio-text-to-text

1190be1

Merge branch 'main' into add-audio-text-to-text

d4627be

Vaibhavs10 reviewed Aug 31, 2025

Vaibhavs10 closed this Aug 31, 2025
